A Comparative Study of Reliable Error Estimators for Pruning Regression Trees
Abstract
This paper presents a comparative study of several methods for estimating the true error of tree-structured regression models. We evaluate these methods in the context of regression tree pruning, which is considered a key issue for obtaining reliable tree-structured models in real-world scenarios. The major step of a pruning process consists of obtaining accurate estimates of the error of alternative tree models. We experimentally evaluate four methods for obtaining these estimates in twelve domains. The goal of this evaluation is to characterise how well each method selects the best possible tree among the set of trees considered during pruning. The results show that certain estimators lead to poor decisions in some domains. The Cross Validation variant that we propose achieved the best results on the set-ups we considered.
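The selection step described in the abstract can be illustrated with a small sketch. It is not the Cross Validation variant proposed in the paper, whose details are not given here. It uses plain k-fold cross validation, synthetic one-dimensional data, and depth-limited trees as a stand-in for the set of pruned trees. All names (`fit_tree`, `cv_error`, `best_depth`) are hypothetical and chosen for illustration only.

```python
import random
from statistics import mean

def sse(ys):
    """Sum of squared deviations from the mean."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(points, depth):
    """Grow a least-squares regression tree on (x, y) pairs, at most `depth` levels."""
    points = sorted(points)
    ys = [y for _, y in points]
    if depth == 0 or len(points) < 4:
        return mean(ys)  # leaf: predict the mean target
    best = None
    for i in range(2, len(points) - 1):  # keep at least 2 points per side
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between identical x values
        cost = sse(ys[:i]) + sse(ys[i:])
        if best is None or cost < best[0]:
            best = (cost, i)
    if best is None:
        return mean(ys)
    i = best[1]
    threshold = (points[i - 1][0] + points[i][0]) / 2
    return (threshold,
            fit_tree(points[:i], depth - 1),
            fit_tree(points[i:], depth - 1))

def predict(tree, x):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree

def mse(tree, points):
    """Mean squared error of `tree` on `points`."""
    return mean([(y - predict(tree, x)) ** 2 for x, y in points])

def cv_error(points, depth, k=5):
    """Plain k-fold cross-validation estimate of the error at a given depth."""
    folds = [points[i::k] for i in range(k)]
    errors = []
    for i, test in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors.append(mse(fit_tree(train, depth), test))
    return mean(errors)

# Synthetic data (an assumption for this sketch): a noisy step function.
random.seed(0)
data = [(x, (1.0 if x > 0.5 else 0.0) + random.gauss(0, 0.3))
        for x in (random.random() for _ in range(200))]

depths = range(0, 7)  # candidate trees, from a single leaf to depth 6
resub = {d: mse(fit_tree(data, d), data) for d in depths}  # training-set error
cv = {d: cv_error(data, d) for d in depths}                # held-out estimate
best_depth = min(cv, key=cv.get)  # candidate with the lowest estimated error
```

In this sketch the resubstitution error cannot increase as the tree grows, so it would always favour the largest candidate. Selecting by the cross-validation estimate is one way to avoid that bias.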
Related resources
Error Estimators for Pruning Regression Trees
This paper presents a comparative study of several methods for estimating the true error of tree-structured regression models. We evaluate these methods in the context of regression tree pruning. The study is focused on problems where large samples of data are available. We present two novel variants of existent estimation methods. We evaluate several methods that follow different approaches to...
Pruning Regression Trees with MDL
Pruning is a method for reducing the error and complexity of induced trees. There are several approaches to pruning decision trees, while regression trees have attracted less attention. We propose a method for pruning regression trees based on the sound foundations of the MDL principle. We develop coding schemes for various constructs and models in the leaves and empirically test the new method...
Toward a Thorough Approach to Predicting Klinkenberg Permeability in a Tight Gas Reservoir: A Comparative Study
Klinkenberg permeability is an important parameter in tight gas reservoirs. There are conventional methods for determining it, but these methods depend on core permeability. Cores are few in number, but well logs are usually accessible for all wells and provide continuous information. In this regard, regression methods have been used to achieve reliable relations between log readings and Klinke...
Liu Estimates and Influence Analysis in Regression Models with Stochastic Linear Restrictions and AR (1) Errors
In the linear regression models with AR (1) error structure when collinearity exists, stochastic linear restrictions or modifications of biased estimators (including Liu estimators) can be used to reduce the estimated variance of the regression coefficients estimates. In this paper, the combination of the biased Liu estimator and stochastic linear restrictions estimator is considered to overcom...
A Comparative Analysis of Methods for Pruning Decision Trees
In this paper, we address the problem of retrospectively pruning decision trees induced from data, according to a top-down approach. This problem has received considerable attention in the areas of pattern recognition and machine learning, and many distinct methods have been proposed in the literature. We make a comparative study of six well-known pruning methods with the aim of understanding their ...
Publication date: 1998